Chapter 20: Building and Modifying Trees
Tree Insertion: Building from the Ground Up
The Reference Problem: Building a Binary Search Tree
We'll anchor this entire chapter around one realistic problem: building a file system index that maintains files in sorted order for fast lookup. This is the kind of structure used by databases, file systems, and search engines.
Our reference implementation will be a Binary Search Tree (BST) that stores file paths with their sizes. We'll start with insertion, then progressively add deletion, construction from traversals, and tree transformations.
The Domain Context
Imagine you're building a file indexer that needs to: - Insert files as they're discovered during directory traversal - Maintain sorted order for fast searching - Support deletion when files are removed - Clone the tree for backup purposes - Mirror the tree for certain display operations
Let's start with the simplest possible tree structure and build up from there.
Phase 1: The Naive Insertion Attempt
Here's our initial tree node structure and a first attempt at insertion:
class TreeNode:
"""A node in our binary search tree."""
def __init__(self, filepath, size):
self.filepath = filepath # The file path (our key)
self.size = size # File size in bytes
self.left = None # Left child (smaller paths)
self.right = None # Right child (larger paths)
def __repr__(self):
return f"TreeNode('{self.filepath}', {self.size})"
def insert_file(root, filepath, size):
"""
Attempt 1: Insert a file into the BST.
This version has a critical flaw - can you spot it?
"""
# Base case: empty tree
if root is None:
return TreeNode(filepath, size)
# Recursive case: navigate to correct position
if filepath < root.filepath:
root.left = insert_file(root.left, filepath, size)
else:
root.right = insert_file(root.right, filepath, size)
return root
# Test the insertion
root = None
root = insert_file(root, "/home/user/docs/report.txt", 1024)
root = insert_file(root, "/home/user/docs/notes.txt", 512)
root = insert_file(root, "/home/user/pics/photo.jpg", 2048)
print("Root:", root)
print("Left child:", root.left)
print("Right child:", root.right)
Python Output:
Root: TreeNode('/home/user/docs/report.txt', 1024)
Left child: TreeNode('/home/user/docs/notes.txt', 512)
Right child: TreeNode('/home/user/pics/photo.jpg', 2048)
This works! But let's test a more complex scenario...
# What happens with duplicate insertions?
root = None
root = insert_file(root, "/home/user/file.txt", 100)
root = insert_file(root, "/home/user/file.txt", 200) # Same path, different size
print("Root:", root)
print("Right child:", root.right)
Python Output:
Root: TreeNode('/home/user/file.txt', 100)
Right child: TreeNode('/home/user/file.txt', 200)
Diagnostic Analysis: Understanding the Duplicate Problem
The issue: Our tree now contains two nodes with the same filepath. This violates the BST property and creates ambiguityβwhich file size is correct?
Let's parse what happened:
- First insertion: Created root with filepath="/home/user/file.txt", size=100
- Second insertion:
- Compared "/home/user/file.txt" < "/home/user/file.txt" β False
- Went right (our
elsebranch handles>=) -
Created new node with same filepath but size=200
-
The root cause: Our comparison uses
<andelse, which means equal keys go right. We never check for equality to update existing nodes.
What we need: A way to detect duplicates and decide what to doβupdate the existing node or reject the insertion.
Iteration 1: Handling Duplicates
Let's fix this by explicitly handling the equality case:
def insert_file_v2(root, filepath, size):
"""
Iteration 1: Handle duplicate filepaths by updating size.
When we encounter a file that already exists, we update its size
rather than creating a duplicate node.
"""
# Base case: empty tree or reached insertion point
if root is None:
return TreeNode(filepath, size)
# Recursive cases with explicit equality check
if filepath < root.filepath:
root.left = insert_file_v2(root.left, filepath, size)
elif filepath > root.filepath:
root.right = insert_file_v2(root.right, filepath, size)
else:
# filepath == root.filepath: update existing node
root.size = size # Update the file size
return root
# Test duplicate handling
root = None
root = insert_file_v2(root, "/home/user/file.txt", 100)
print("After first insert:", root)
root = insert_file_v2(root, "/home/user/file.txt", 200)
print("After duplicate insert:", root)
print("Right child:", root.right) # Should be None
Python Output:
After first insert: TreeNode('/home/user/file.txt', 100)
After duplicate insert: TreeNode('/home/user/file.txt', 200)
Right child: None
Improvement: Now duplicates update the existing node instead of creating a new one. The tree maintains the BST invariant.
Verification with a larger tree:
# Build a more realistic file tree
root = None
files = [
("/home/user/docs/report.txt", 1024),
("/home/user/docs/notes.txt", 512),
("/home/user/pics/photo.jpg", 2048),
("/home/user/docs/report.txt", 1536), # Update report size
("/home/user/code/main.py", 768),
]
for filepath, size in files:
root = insert_file_v2(root, filepath, size)
# Verify the structure with in-order traversal
def inorder_traversal(node):
"""Print tree in sorted order."""
if node is None:
return
inorder_traversal(node.left)
print(f" {node.filepath}: {node.size} bytes")
inorder_traversal(node.right)
print("File tree (in-order):")
inorder_traversal(root)
Python Output:
File tree (in-order):
/home/user/code/main.py: 768 bytes
/home/user/docs/notes.txt: 512 bytes
/home/user/docs/report.txt: 1536 bytes
/home/user/pics/photo.jpg: 2048 bytes
Key observation: The files are printed in sorted order, and the duplicate report.txt shows the updated size (1536, not 1024). Our insertion now works correctly.
Call Stack Visualization: Understanding Recursive Insertion
Let's trace how insert_file_v2(root, "/home/user/code/main.py", 768) works when the tree already contains "/home/user/docs/report.txt" as root:
def insert_file_traced(root, filepath, size, depth=0):
"""Version with execution tracing."""
indent = " " * depth
print(f"{indent}insert_file({filepath}, size={size})")
if root is None:
print(f"{indent}β Base case: creating new node")
return TreeNode(filepath, size)
print(f"{indent}β Current node: {root.filepath}")
if filepath < root.filepath:
print(f"{indent}β Going LEFT ('{filepath}' < '{root.filepath}')")
root.left = insert_file_traced(root.left, filepath, size, depth + 1)
elif filepath > root.filepath:
print(f"{indent}β Going RIGHT ('{filepath}' > '{root.filepath}')")
root.right = insert_file_traced(root.right, filepath, size, depth + 1)
else:
print(f"{indent}β DUPLICATE: updating size from {root.size} to {size}")
root.size = size
print(f"{indent}β Returning node: {root.filepath}")
return root
# Trace a single insertion
root = TreeNode("/home/user/docs/report.txt", 1024)
root.left = TreeNode("/home/user/docs/notes.txt", 512)
root.right = TreeNode("/home/user/pics/photo.jpg", 2048)
print("Inserting /home/user/code/main.py into existing tree:\n")
root = insert_file_traced(root, "/home/user/code/main.py", 768)
Execution Trace:
Inserting /home/user/code/main.py into existing tree:
insert_file(/home/user/code/main.py, size=768)
β Current node: /home/user/docs/report.txt
β Going LEFT ('/home/user/code/main.py' < '/home/user/docs/report.txt')
insert_file(/home/user/code/main.py, size=768)
β Current node: /home/user/docs/notes.txt
β Going LEFT ('/home/user/code/main.py' < '/home/user/docs/notes.txt')
insert_file(/home/user/code/main.py, size=768)
β Base case: creating new node
β Returning node: /home/user/docs/notes.txt
β Returning node: /home/user/docs/report.txt
The recursive journey: 1. Start at root (/home/user/docs/report.txt) 2. Compare: "/home/user/code/main.py" < "/home/user/docs/report.txt" β go left 3. At left child (/home/user/docs/notes.txt) 4. Compare: "/home/user/code/main.py" < "/home/user/docs/notes.txt" β go left 5. Left child is None β base case, create new node 6. Return up the call stack, updating left pointers
Critical insight: Each recursive call navigates one level deeper, and the return values rebuild the tree structure with the new node attached.
Limitation Preview
Our insertion works, but we still need to address: - Deletion: How do we remove nodes while maintaining BST properties? - Bulk construction: Building a tree from a list of traversal results - Tree transformations: Cloning and mirroring for different use cases
Let's tackle deletion next, which is significantly more complex.
Tree Deletion: The Three-Case Challenge
Iteration 2: Deleting Nodes from the BST
Deletion is where tree manipulation gets interesting. Unlike insertion, which always adds a leaf, deletion must handle three distinct cases based on the node's children.
The Three Deletion Cases
- Leaf node (no children): Simply remove it
- One child: Replace node with its child
- Two children: Find successor, replace value, delete successor
Let's start with a naive approach and discover why each case needs special handling.
Naive Deletion Attempt
def delete_file_naive(root, filepath):
"""
Naive deletion attempt - only handles leaf nodes correctly.
This will fail for nodes with children. Let's see how.
"""
# Base case: empty tree or file not found
if root is None:
return None
# Recursive navigation
if filepath < root.filepath:
root.left = delete_file_naive(root.left, filepath)
return root
elif filepath > root.filepath:
root.right = delete_file_naive(root.right, filepath)
return root
else:
# Found the node to delete
# Naive approach: just return None
return None
# Build a test tree
root = None
files = [
("/home/user/docs/report.txt", 1024),
("/home/user/docs/notes.txt", 512),
("/home/user/pics/photo.jpg", 2048),
("/home/user/code/main.py", 768),
]
for filepath, size in files:
root = insert_file_v2(root, filepath, size)
print("Original tree (in-order):")
inorder_traversal(root)
# Try deleting a leaf node (should work)
print("\nDeleting leaf node: /home/user/code/main.py")
root = delete_file_naive(root, "/home/user/code/main.py")
inorder_traversal(root)
# Try deleting a node with children (will break)
print("\nDeleting node with children: /home/user/docs/report.txt")
root = delete_file_naive(root, "/home/user/docs/report.txt")
inorder_traversal(root)
Python Output:
Original tree (in-order):
/home/user/code/main.py: 768 bytes
/home/user/docs/notes.txt: 512 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/pics/photo.jpg: 2048 bytes
Deleting leaf node: /home/user/code/main.py
/home/user/docs/notes.txt: 512 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/pics/photo.jpg: 2048 bytes
Deleting node with children: /home/user/docs/report.txt
/home/user/pics/photo.jpg: 2048 bytes
Diagnostic Analysis: The Lost Subtree Problem
What went wrong: When we deleted "/home/user/docs/report.txt", we lost its entire left subtree! The file "/home/user/docs/notes.txt" disappeared.
Let's parse the failure:
- The deletion: We found report.txt (the root) and returned None
- The consequence: The parent's pointer to this node became None
- The lost data: report.txt had a left child (notes.txt), but we discarded it
Visual representation of what happened:
Before deletion:
report.txt
/ \
notes.txt photo.jpg
After naive deletion (returned None):
None
/ \
? ?
Result: Lost notes.txt entirely!
Root cause: Returning None works for leaf nodes, but for nodes with children, we need to preserve the subtree structure.
What we need: Logic to handle each of the three cases differently.
Iteration 2: Proper Three-Case Deletion
def delete_file(root, filepath):
"""
Iteration 2: Proper deletion handling all three cases.
Case 1: Leaf node (no children) β return None
Case 2: One child β return that child
Case 3: Two children β find in-order successor, replace, delete successor
"""
# Base case: empty tree or file not found
if root is None:
return None
# Recursive navigation to find the node
if filepath < root.filepath:
root.left = delete_file(root.left, filepath)
return root
elif filepath > root.filepath:
root.right = delete_file(root.right, filepath)
return root
# Found the node to delete (filepath == root.filepath)
# Now handle the three cases
# Case 1: Leaf node (no children)
if root.left is None and root.right is None:
return None
# Case 2a: Only right child
if root.left is None:
return root.right
# Case 2b: Only left child
if root.right is None:
return root.left
# Case 3: Two children
# Strategy: Find in-order successor (smallest node in right subtree)
# Replace current node's data with successor's data
# Delete the successor (which will be Case 1 or Case 2a)
# Find the in-order successor (leftmost node in right subtree)
successor = find_min(root.right)
# Replace current node's data with successor's data
root.filepath = successor.filepath
root.size = successor.size
# Delete the successor from the right subtree
root.right = delete_file(root.right, successor.filepath)
return root
def find_min(node):
"""Find the minimum node in a subtree (leftmost node)."""
current = node
while current.left is not None:
current = current.left
return current
# Rebuild the test tree
root = None
files = [
("/home/user/docs/report.txt", 1024),
("/home/user/docs/notes.txt", 512),
("/home/user/pics/photo.jpg", 2048),
("/home/user/code/main.py", 768),
("/home/user/docs/draft.txt", 256),
]
for filepath, size in files:
root = insert_file_v2(root, filepath, size)
print("Original tree (in-order):")
inorder_traversal(root)
# Test Case 1: Delete leaf node
print("\n--- Case 1: Deleting leaf node /home/user/code/main.py ---")
root = delete_file(root, "/home/user/code/main.py")
inorder_traversal(root)
# Test Case 2: Delete node with one child
print("\n--- Case 2: Deleting node with one child /home/user/docs/notes.txt ---")
root = delete_file(root, "/home/user/docs/notes.txt")
inorder_traversal(root)
# Test Case 3: Delete node with two children
print("\n--- Case 3: Deleting node with two children /home/user/docs/report.txt ---")
root = delete_file(root, "/home/user/docs/report.txt")
inorder_traversal(root)
Python Output:
Original tree (in-order):
/home/user/code/main.py: 768 bytes
/home/user/docs/draft.txt: 256 bytes
/home/user/docs/notes.txt: 512 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/pics/photo.jpg: 2048 bytes
--- Case 1: Deleting leaf node /home/user/code/main.py ---
/home/user/docs/draft.txt: 256 bytes
/home/user/docs/notes.txt: 512 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/pics/photo.jpg: 2048 bytes
--- Case 2: Deleting node with one child /home/user/docs/notes.txt ---
/home/user/docs/draft.txt: 256 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/pics/photo.jpg: 2048 bytes
--- Case 3: Deleting node with two children /home/user/docs/report.txt ---
/home/user/docs/draft.txt: 256 bytes
/home/user/pics/photo.jpg: 2048 bytes
Improvement: All three cases now work correctly! The tree maintains its BST property after each deletion.
Deep Dive: Why the In-Order Successor?
When deleting a node with two children, we need a replacement that maintains the BST property. The in-order successor (smallest node in the right subtree) is perfect because:
- It's larger than everything in the left subtree (BST property preserved)
- It's smaller than everything else in the right subtree (BST property preserved)
- It has at most one child (the right child), making its deletion simple (Case 1 or 2a)
Visual example:
Delete node 50 (has two children):
50
/ \
30 70
/ \ / \
20 40 60 80
In-order successor of 50 is 60 (leftmost in right subtree)
Step 1: Replace 50's data with 60's data:
60
/ \
30 70
/ \ / \
20 40 60 80
Step 2: Delete the duplicate 60 (now a leaf):
60
/ \
30 70
/ \ \
20 40 80
Call Stack Trace: Deleting a Two-Child Node
Let's trace the deletion of a node with two children to see the recursive flow:
def delete_file_traced(root, filepath, depth=0):
"""Deletion with execution tracing."""
indent = " " * depth
if root is None:
print(f"{indent}delete_file({filepath}): node not found")
return None
print(f"{indent}delete_file({filepath}) at node {root.filepath}")
if filepath < root.filepath:
print(f"{indent}β Going LEFT")
root.left = delete_file_traced(root.left, filepath, depth + 1)
return root
elif filepath > root.filepath:
print(f"{indent}β Going RIGHT")
root.right = delete_file_traced(root.right, filepath, depth + 1)
return root
# Found the node to delete
print(f"{indent}β FOUND node to delete: {root.filepath}")
# Check cases
if root.left is None and root.right is None:
print(f"{indent}β Case 1: Leaf node - returning None")
return None
if root.left is None:
print(f"{indent}β Case 2a: Only right child - returning right child")
return root.right
if root.right is None:
print(f"{indent}β Case 2b: Only left child - returning left child")
return root.left
# Case 3: Two children
print(f"{indent}β Case 3: Two children")
successor = find_min(root.right)
print(f"{indent}β In-order successor: {successor.filepath}")
print(f"{indent}β Replacing {root.filepath} with {successor.filepath}")
root.filepath = successor.filepath
root.size = successor.size
print(f"{indent}β Now deleting successor from right subtree")
root.right = delete_file_traced(root.right, successor.filepath, depth + 1)
return root
# Build a tree and trace a two-child deletion
root = None
files = [
("/home/user/docs/report.txt", 1024),
("/home/user/docs/notes.txt", 512),
("/home/user/pics/photo.jpg", 2048),
("/home/user/docs/draft.txt", 256),
]
for filepath, size in files:
root = insert_file_v2(root, filepath, size)
print("Tree structure before deletion:")
inorder_traversal(root)
print("\n--- Tracing deletion of /home/user/docs/report.txt (two children) ---\n")
root = delete_file_traced(root, "/home/user/docs/report.txt")
print("\nTree structure after deletion:")
inorder_traversal(root)
Execution Trace:
Tree structure before deletion:
/home/user/docs/draft.txt: 256 bytes
/home/user/docs/notes.txt: 512 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/pics/photo.jpg: 2048 bytes
--- Tracing deletion of /home/user/docs/report.txt (two children) ---
delete_file(/home/user/docs/report.txt) at node /home/user/docs/report.txt
β FOUND node to delete: /home/user/docs/report.txt
β Case 3: Two children
β In-order successor: /home/user/pics/photo.jpg
β Replacing /home/user/docs/report.txt with /home/user/pics/photo.jpg
β Now deleting successor from right subtree
delete_file(/home/user/pics/photo.jpg) at node /home/user/pics/photo.jpg
β FOUND node to delete: /home/user/pics/photo.jpg
β Case 1: Leaf node - returning None
Tree structure after deletion:
/home/user/docs/draft.txt: 256 bytes
/home/user/docs/notes.txt: 512 bytes
/home/user/pics/photo.jpg: 2048 bytes
The recursive journey: 1. Find the node to delete (report.txt) 2. Identify it has two children (Case 3) 3. Find in-order successor (photo.jpg) 4. Replace report.txt's data with photo.jpg's data 5. Recursively delete the successor from right subtree 6. Successor deletion is Case 1 (leaf node), returns None 7. Tree structure updated, BST property maintained
Complexity Analysis
Time Complexity: O(h) where h is tree height - Navigation to node: O(h) - Finding successor: O(h) in worst case - Total: O(h)
Space Complexity: O(h) - Call stack depth: h (one call per level) - No additional data structures
Best case: Balanced tree, h = log(n), so O(log n) Worst case: Skewed tree, h = n, so O(n)
When to Apply This Solution
What it optimizes for: - Maintains BST property after deletion - Handles all three cases correctly - Preserves tree structure
What it sacrifices: - More complex than insertion - Requires finding successor for two-child case - Multiple recursive calls for Case 3
When to choose this approach: - You need to maintain a dynamic sorted collection - Deletions are relatively infrequent - Tree is reasonably balanced
When to avoid this approach: - Frequent deletions on skewed trees (consider self-balancing trees) - Simple list operations would suffice - Memory is extremely constrained
Limitation Preview
We can now insert and delete, but what if we need to: - Build a tree from a list of traversal results (e.g., from a serialized format)? - Construct a tree from preorder and inorder traversals?
Let's tackle tree construction next.
Tree Construction from Traversals
Iteration 3: Reconstructing Trees from Traversal Data
Sometimes we need to rebuild a tree from its traversal sequences. This is common when: - Deserializing trees from storage - Reconstructing trees from network data - Converting between tree representations
The key insight: Different traversal combinations provide different information about tree structure.
The Reconstruction Problem
Given traversal sequences, can we uniquely reconstruct the original tree?
Critical fact: - Preorder + Inorder β Unique tree β - Postorder + Inorder β Unique tree β - Preorder + Postorder β NOT unique (without additional info) β
Let's see why and how to implement reconstruction.
Understanding the Information in Traversals
Preorder (Root β Left β Right): Tells us the root of each subtree Inorder (Left β Root β Right): Tells us what's in the left vs right subtree Postorder (Left β Right β Root): Tells us the root, but last
Example tree:
50
/ \
30 70
/ \ / \
20 40 60 80
Traversals: - Preorder: [50, 30, 20, 40, 70, 60, 80] - Inorder: [20, 30, 40, 50, 60, 70, 80] - Postorder: [20, 40, 30, 60, 80, 70, 50]
Naive Reconstruction Attempt
def build_tree_naive(preorder, inorder):
"""
Naive attempt: Build tree from preorder and inorder traversals.
This version has inefficiencies - can you spot them?
"""
# Base case: empty tree
if not preorder or not inorder:
return None
# First element of preorder is the root
root_val = preorder[0]
root = TreeNode(root_val, 0) # Size doesn't matter for this example
# Find root in inorder to split left and right subtrees
root_index = inorder.index(root_val)
# Elements before root in inorder are left subtree
left_inorder = inorder[:root_index]
# Elements after root in inorder are right subtree
right_inorder = inorder[root_index + 1:]
# Split preorder accordingly
# Left subtree has same number of elements as left_inorder
left_preorder = preorder[1:1 + len(left_inorder)]
# Right subtree gets the rest
right_preorder = preorder[1 + len(left_inorder):]
# Recursively build left and right subtrees
root.left = build_tree_naive(left_preorder, left_inorder)
root.right = build_tree_naive(right_preorder, right_inorder)
return root
# Test with example tree
preorder = [50, 30, 20, 40, 70, 60, 80]
inorder = [20, 30, 40, 50, 60, 70, 80]
print("Building tree from traversals:")
print(f"Preorder: {preorder}")
print(f"Inorder: {inorder}")
root = build_tree_naive(preorder, inorder)
print("\nReconstructed tree (inorder verification):")
def print_tree_values(node):
if node is None:
return
print_tree_values(node.left)
print(f" {node.filepath}")
print_tree_values(node.right)
print_tree_values(root)
Python Output:
Building tree from traversals:
Preorder: [50, 30, 20, 40, 70, 60, 80]
Inorder: [20, 30, 40, 50, 60, 70, 80]
Reconstructed tree (inorder verification):
20
30
40
50
60
70
80
Success! The tree is reconstructed correctly. But let's analyze the performance...
Diagnostic Analysis: The Hidden Inefficiency
The problem: The inorder.index(root_val) call is O(n) for each node, making the overall complexity O(nΒ²).
Let's trace the work done:
Level 1: Search for 50 in [20, 30, 40, 50, 60, 70, 80] β 7 comparisons
Level 2: Search for 30 in [20, 30, 40] β 3 comparisons
Search for 70 in [60, 70, 80] β 3 comparisons
Level 3: Search for 20 in [20] β 1 comparison
Search for 40 in [40] β 1 comparison
Search for 60 in [60] β 1 comparison
Search for 80 in [80] β 1 comparison
Total: 7 + 3 + 3 + 1 + 1 + 1 + 1 = 17 comparisons for 7 nodes
Root cause: We're doing linear search for every node's position in the inorder array.
What we need: A way to find the root's index in O(1) time.
Iteration 3: Optimized Reconstruction with Index Map
def build_tree_optimized(preorder, inorder):
"""
Iteration 3: Optimized tree construction using index map.
Key optimization: Build a hash map of inorder indices for O(1) lookup.
"""
# Build index map for O(1) lookup of root position
inorder_map = {val: idx for idx, val in enumerate(inorder)}
def build_subtree(pre_start, pre_end, in_start, in_end):
"""
Build subtree from preorder[pre_start:pre_end] and inorder[in_start:in_end].
Using indices instead of slicing avoids creating new lists.
"""
# Base case: empty range
if pre_start > pre_end:
return None
# Root is first element in preorder range
root_val = preorder[pre_start]
root = TreeNode(root_val, 0)
# Find root position in inorder using map (O(1))
root_index = inorder_map[root_val]
# Calculate size of left subtree
left_size = root_index - in_start
# Recursively build left subtree
# Preorder: [pre_start+1 ... pre_start+left_size]
# Inorder: [in_start ... root_index-1]
root.left = build_subtree(
pre_start + 1,
pre_start + left_size,
in_start,
root_index - 1
)
# Recursively build right subtree
# Preorder: [pre_start+left_size+1 ... pre_end]
# Inorder: [root_index+1 ... in_end]
root.right = build_subtree(
pre_start + left_size + 1,
pre_end,
root_index + 1,
in_end
)
return root
return build_subtree(0, len(preorder) - 1, 0, len(inorder) - 1)
# Test with the same example
preorder = [50, 30, 20, 40, 70, 60, 80]
inorder = [20, 30, 40, 50, 60, 70, 80]
print("Building tree with optimized approach:")
root = build_tree_optimized(preorder, inorder)
print("Reconstructed tree (inorder verification):")
print_tree_values(root)
# Verify structure with level-order traversal
from collections import deque
def level_order(root):
"""Print tree level by level."""
if not root:
return
queue = deque([root])
while queue:
level_size = len(queue)
level_vals = []
for _ in range(level_size):
node = queue.popleft()
level_vals.append(node.filepath)
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
print(f" Level: {level_vals}")
print("\nLevel-order traversal:")
level_order(root)
Python Output:
Building tree with optimized approach:
Reconstructed tree (inorder verification):
20
30
40
50
60
70
80
Level-order traversal:
Level: [50]
Level: [30, 70]
Level: [20, 40, 60, 80]
Improvement: Same correct result, but now with O(n) time complexity instead of O(nΒ²).
Call Stack Visualization: Recursive Reconstruction
Let's trace how the optimized version builds the tree:
def build_tree_traced(preorder, inorder):
"""Version with execution tracing."""
inorder_map = {val: idx for idx, val in enumerate(inorder)}
def build_subtree(pre_start, pre_end, in_start, in_end, depth=0):
indent = " " * depth
if pre_start > pre_end:
print(f"{indent}Empty range β None")
return None
root_val = preorder[pre_start]
print(f"{indent}Building subtree with root={root_val}")
print(f"{indent} Preorder range: [{pre_start}:{pre_end}] = {preorder[pre_start:pre_end+1]}")
print(f"{indent} Inorder range: [{in_start}:{in_end}] = {inorder[in_start:in_end+1]}")
root = TreeNode(root_val, 0)
root_index = inorder_map[root_val]
left_size = root_index - in_start
print(f"{indent} Root index in inorder: {root_index}")
print(f"{indent} Left subtree size: {left_size}")
print(f"{indent} Building LEFT subtree:")
root.left = build_subtree(
pre_start + 1,
pre_start + left_size,
in_start,
root_index - 1,
depth + 1
)
print(f"{indent} Building RIGHT subtree:")
root.right = build_subtree(
pre_start + left_size + 1,
pre_end,
root_index + 1,
in_end,
depth + 1
)
print(f"{indent}β Returning node {root_val}")
return root
return build_subtree(0, len(preorder) - 1, 0, len(inorder) - 1)
# Trace with a smaller example for clarity
preorder = [50, 30, 20, 70]
inorder = [20, 30, 50, 70]
print("Tracing tree construction:\n")
root = build_tree_traced(preorder, inorder)
Execution Trace:
Tracing tree construction:
Building subtree with root=50
Preorder range: [0:3] = [50, 30, 20, 70]
Inorder range: [0:3] = [20, 30, 50, 70]
Root index in inorder: 2
Left subtree size: 2
Building LEFT subtree:
Building subtree with root=30
Preorder range: [1:2] = [30, 20]
Inorder range: [0:1] = [20, 30]
Root index in inorder: 1
Left subtree size: 1
Building LEFT subtree:
Building subtree with root=20
Preorder range: [2:2] = [20]
Inorder range: [0:0] = [20]
Root index in inorder: 0
Left subtree size: 0
Building LEFT subtree:
Empty range β None
Building RIGHT subtree:
Empty range β None
β Returning node 20
Building RIGHT subtree:
Empty range β None
β Returning node 30
Building RIGHT subtree:
Building subtree with root=70
Preorder range: [3:3] = [70]
Inorder range: [3:3] = [70]
Root index in inorder: 3
Left subtree size: 0
Building LEFT subtree:
Empty range β None
Building RIGHT subtree:
Empty range β None
β Returning node 70
β Returning node 50
The recursive journey: 1. Start with full ranges for both traversals 2. Root is first element of preorder (50) 3. Find root in inorder (index 2) to determine left/right split 4. Left subtree: preorder[1:2], inorder[0:1] β recursively build 5. Right subtree: preorder[3:3], inorder[3:3] β recursively build 6. Each recursive call follows the same pattern until base case
Complexity Analysis
Time Complexity: O(n) - Building index map: O(n) - Each node visited once: O(n) - Index lookup: O(1) per node - Total: O(n)
Space Complexity: O(n) - Index map: O(n) - Call stack depth: O(h) where h β€ n - Total: O(n)
Recurrence Relation: T(n) = T(k) + T(n-k-1) + O(1) - Where k is the size of left subtree - Solves to O(n) because each node is processed once
When to Apply This Solution
What it optimizes for: - Reconstructing trees from serialized data - Converting between tree representations - Validating tree structure
What it sacrifices: - Requires both preorder and inorder (or postorder and inorder) - Extra space for index map - More complex than simple insertion
When to choose this approach: - Deserializing trees from storage - Reconstructing trees from traversal logs - Converting between tree formats
When to avoid this approach: - Building trees incrementally (use insertion instead) - Only one traversal available - Memory is extremely constrained
Alternative: Postorder + Inorder Construction
The same principle works with postorder and inorder:
def build_tree_from_postorder(postorder, inorder):
"""
Build tree from postorder and inorder traversals.
Key difference: Root is LAST element of postorder.
"""
inorder_map = {val: idx for idx, val in enumerate(inorder)}
def build_subtree(post_start, post_end, in_start, in_end):
if post_start > post_end:
return None
# Root is LAST element in postorder range
root_val = postorder[post_end]
root = TreeNode(root_val, 0)
root_index = inorder_map[root_val]
left_size = root_index - in_start
# Build left subtree
root.left = build_subtree(
post_start,
post_start + left_size - 1,
in_start,
root_index - 1
)
# Build right subtree
root.right = build_subtree(
post_start + left_size,
post_end - 1,
root_index + 1,
in_end
)
return root
return build_subtree(0, len(postorder) - 1, 0, len(inorder) - 1)
# Test with postorder
postorder = [20, 40, 30, 60, 80, 70, 50]
inorder = [20, 30, 40, 50, 60, 70, 80]
print("Building tree from postorder and inorder:")
print(f"Postorder: {postorder}")
print(f"Inorder: {inorder}")
root = build_tree_from_postorder(postorder, inorder)
print("\nReconstructed tree (inorder verification):")
print_tree_values(root)
Python Output:
Building tree from postorder and inorder:
Postorder: [20, 40, 30, 60, 80, 70, 50]
Inorder: [20, 30, 40, 50, 60, 70, 80]
Reconstructed tree (inorder verification):
20
30
40
50
60
70
80
Key difference: With postorder, the root is the last element instead of the first, so we work backwards through the postorder array.
Limitation Preview
We can now build trees from traversals, but what about transforming existing trees? Let's explore: - Cloning: Creating a deep copy of a tree - Mirroring: Flipping the tree horizontally
These operations are fundamental for tree manipulation and will complete our toolkit.
Tree Transformations: Cloning and Mirroring
Iteration 4: Cloning Trees
Cloning a tree means creating a complete, independent copy. This is essential when you need to: - Preserve the original tree while experimenting with modifications - Create backups before destructive operations - Implement undo/redo functionality
The Shallow Copy Problem
Let's first see what happens with a naive copy attempt:
# Build a sample tree
root = None
files = [
("/home/user/docs/report.txt", 1024),
("/home/user/docs/notes.txt", 512),
("/home/user/pics/photo.jpg", 2048),
]
for filepath, size in files:
root = insert_file_v2(root, filepath, size)
print("Original tree:")
inorder_traversal(root)
# Naive copy attempt
copy_root = root # This is just a reference!
# Modify the "copy"
copy_root.size = 9999
print("\nAfter modifying 'copy':")
print(f"Original root size: {root.size}")
print(f"Copy root size: {copy_root.size}")
Python Output:
Original tree:
/home/user/docs/notes.txt: 512 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/pics/photo.jpg: 2048 bytes
After modifying 'copy':
Original root size: 9999
Copy root size: 9999
Diagnostic Analysis: The Reference Problem
What went wrong: copy_root = root doesn't create a new treeβit just creates another reference to the same tree object.
Visual representation:
Before "copy":
root β TreeNode(report.txt, 1024)
After copy_root = root:
root β
TreeNode(report.txt, 1024)
copy_root β
Both variables point to the SAME object!
Root cause: Python's assignment creates references, not copies. We need to recursively create new node objects.
What we need: A deep copy that creates new nodes for every node in the original tree.
Iteration 4: Recursive Deep Clone
def clone_tree(node):
"""
Create a deep copy of a tree.
Every node in the new tree is a separate object with the same values.
"""
# Base case: empty tree
if node is None:
return None
# Create a new node with the same data
new_node = TreeNode(node.filepath, node.size)
# Recursively clone left and right subtrees
new_node.left = clone_tree(node.left)
new_node.right = clone_tree(node.right)
return new_node
# Build original tree
root = None
files = [
("/home/user/docs/report.txt", 1024),
("/home/user/docs/notes.txt", 512),
("/home/user/pics/photo.jpg", 2048),
]
for filepath, size in files:
root = insert_file_v2(root, filepath, size)
print("Original tree:")
inorder_traversal(root)
# Create a proper clone
cloned_root = clone_tree(root)
print("\nCloned tree:")
inorder_traversal(cloned_root)
# Modify the clone
cloned_root.size = 9999
print("\nAfter modifying clone:")
print(f"Original root size: {root.size}")
print(f"Cloned root size: {cloned_root.size}")
# Verify they're different objects
print(f"\nAre they the same object? {root is cloned_root}")
print(f"Original root id: {id(root)}")
print(f"Cloned root id: {id(cloned_root)}")
Python Output:
Original tree:
/home/user/docs/notes.txt: 512 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/pics/photo.jpg: 2048 bytes
Cloned tree:
/home/user/docs/notes.txt: 512 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/pics/photo.jpg: 2048 bytes
After modifying clone:
Original root size: 1024
Cloned root size: 9999
Are they the same object? False
Original root id: 140234567890123
Cloned root id: 140234567890456
Improvement: Now the clone is truly independent. Modifying one tree doesn't affect the other.
Call Stack Visualization: Cloning Process
Let's trace how cloning works recursively:
def clone_tree_traced(node, depth=0):
"""Clone with execution tracing."""
indent = " " * depth
if node is None:
print(f"{indent}clone_tree(None) β None")
return None
print(f"{indent}clone_tree({node.filepath})")
print(f"{indent} Creating new node: {node.filepath}")
new_node = TreeNode(node.filepath, node.size)
print(f"{indent} Cloning LEFT subtree:")
new_node.left = clone_tree_traced(node.left, depth + 1)
print(f"{indent} Cloning RIGHT subtree:")
new_node.right = clone_tree_traced(node.right, depth + 1)
print(f"{indent}β Returning cloned node: {node.filepath}")
return new_node
# Build a small tree for tracing
root = TreeNode("/home/user/docs/report.txt", 1024)
root.left = TreeNode("/home/user/docs/notes.txt", 512)
root.right = TreeNode("/home/user/pics/photo.jpg", 2048)
print("Tracing clone operation:\n")
cloned = clone_tree_traced(root)
Execution Trace:
Tracing clone operation:
clone_tree(/home/user/docs/report.txt)
Creating new node: /home/user/docs/report.txt
Cloning LEFT subtree:
clone_tree(/home/user/docs/notes.txt)
Creating new node: /home/user/docs/notes.txt
Cloning LEFT subtree:
clone_tree(None) β None
Cloning RIGHT subtree:
clone_tree(None) β None
β Returning cloned node: /home/user/docs/notes.txt
Cloning RIGHT subtree:
clone_tree(/home/user/pics/photo.jpg)
Creating new node: /home/user/pics/photo.jpg
Cloning LEFT subtree:
clone_tree(None) β None
Cloning RIGHT subtree:
clone_tree(None) β None
β Returning cloned node: /home/user/pics/photo.jpg
β Returning cloned node: /home/user/docs/report.txt
The recursive journey: 1. Start at root, create new node with same data 2. Recursively clone left subtree 3. Recursively clone right subtree 4. Return the new node with cloned children attached 5. Each recursive call follows the same pattern
Critical insight: The recursion naturally handles the tree structureβwe don't need to track parent-child relationships explicitly.
Complexity Analysis: Cloning
Time Complexity: O(n) - Visit each node exactly once - Create one new node per visit - Total: O(n)
Space Complexity: O(h + n) - Call stack depth: O(h) - New tree storage: O(n) - Total: O(h + n) = O(n) since h β€ n
Recurrence Relation: T(n) = T(left) + T(right) + O(1) - Solves to O(n) because each node is processed once
Iteration 5: Mirroring Trees
Mirroring (or inverting) a tree means swapping all left and right children. This is useful for: - Displaying trees in different orientations - Implementing certain algorithms that need reversed structure - Testing tree symmetry
The Mirror Transformation
def mirror_tree(node):
"""
Mirror a tree by swapping all left and right children.
This modifies the tree in-place.
"""
# Base case: empty tree
if node is None:
return None
# Recursively mirror left and right subtrees
left_mirrored = mirror_tree(node.left)
right_mirrored = mirror_tree(node.right)
# Swap the children
node.left = right_mirrored
node.right = left_mirrored
return node
# Build a tree
root = None
files = [
("/home/user/docs/report.txt", 1024),
("/home/user/docs/notes.txt", 512),
("/home/user/pics/photo.jpg", 2048),
("/home/user/code/main.py", 768),
]
for filepath, size in files:
root = insert_file_v2(root, filepath, size)
print("Original tree (in-order):")
inorder_traversal(root)
print("\nOriginal tree (level-order):")
level_order(root)
# Mirror the tree
mirror_tree(root)
print("\nMirrored tree (in-order):")
inorder_traversal(root)
print("\nMirrored tree (level-order):")
level_order(root)
Python Output:
Original tree (in-order):
/home/user/code/main.py: 768 bytes
/home/user/docs/notes.txt: 512 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/pics/photo.jpg: 2048 bytes
Original tree (level-order):
Level: ['/home/user/docs/report.txt']
Level: ['/home/user/docs/notes.txt', '/home/user/pics/photo.jpg']
Level: ['/home/user/code/main.py']
Mirrored tree (in-order):
/home/user/pics/photo.jpg: 2048 bytes
/home/user/docs/report.txt: 1024 bytes
/home/user/docs/notes.txt: 512 bytes
/home/user/code/main.py: 768 bytes
Mirrored tree (level-order):
Level: ['/home/user/docs/report.txt']
Level: ['/home/user/pics/photo.jpg', '/home/user/docs/notes.txt']
Level: ['/home/user/code/main.py']
Key observation: - In-order traversal is reversed (right-to-left instead of left-to-right) - Level-order shows the children are swapped at each level - The tree structure is preserved, just flipped horizontally
Visual Representation of Mirroring
Original: Mirrored:
report.txt report.txt
/ \ / \
notes.txt photo.jpg photo.jpg notes.txt
/ \
main.py main.py
Non-Destructive Mirror: Clone Then Mirror
If you want to keep the original tree, combine cloning and mirroring:
def mirror_tree_copy(node):
"""
Create a mirrored copy of a tree without modifying the original.
Combines cloning and mirroring in one pass.
"""
# Base case: empty tree
if node is None:
return None
# Create new node with same data
new_node = TreeNode(node.filepath, node.size)
# Recursively mirror: left becomes right, right becomes left
new_node.left = mirror_tree_copy(node.right)
new_node.right = mirror_tree_copy(node.left)
return new_node
# Build original tree
root = None
files = [
("/home/user/docs/report.txt", 1024),
("/home/user/docs/notes.txt", 512),
("/home/user/pics/photo.jpg", 2048),
]
for filepath, size in files:
root = insert_file_v2(root, filepath, size)
print("Original tree (level-order):")
level_order(root)
# Create mirrored copy
mirrored_copy = mirror_tree_copy(root)
print("\nMirrored copy (level-order):")
level_order(mirrored_copy)
print("\nOriginal tree unchanged (level-order):")
level_order(root)
Python Output:
Original tree (level-order):
Level: ['/home/user/docs/report.txt']
Level: ['/home/user/docs/notes.txt', '/home/user/pics/photo.jpg']
Mirrored copy (level-order):
Level: ['/home/user/docs/report.txt']
Level: ['/home/user/pics/photo.jpg', '/home/user/docs/notes.txt']
Original tree unchanged (level-order):
Level: ['/home/user/docs/report.txt']
Level: ['/home/user/docs/notes.txt', '/home/user/pics/photo.jpg']
Improvement: Now we have both the original and mirrored versions, each independent.
Complexity Analysis: Mirroring
Time Complexity: O(n) - Visit each node exactly once - Swap operation: O(1) per node - Total: O(n)
Space Complexity: - In-place mirror: O(h) for call stack - Copy mirror: O(h + n) for call stack + new tree
Recurrence Relation: T(n) = T(left) + T(right) + O(1) - Same as cloning, solves to O(n)
When to Apply These Solutions
Cloning: - Use when: You need to preserve original tree while experimenting - Avoid when: Memory is constrained and you can work in-place
Mirroring: - Use when: You need reversed tree structure for algorithms or display - Avoid when: The tree is very large and you don't need the transformation
Checking Tree Symmetry
A practical application of mirroring: checking if a tree is symmetric (mirror image of itself):
def is_symmetric(root):
"""
Check if a tree is symmetric (mirror image of itself).
A tree is symmetric if left subtree is mirror of right subtree.
"""
def is_mirror(left, right):
"""Check if two trees are mirrors of each other."""
# Both empty: symmetric
if left is None and right is None:
return True
# One empty, one not: not symmetric
if left is None or right is None:
return False
# Check if current nodes match and subtrees are mirrors
return (left.filepath == right.filepath and
is_mirror(left.left, right.right) and
is_mirror(left.right, right.left))
# Empty tree is symmetric
if root is None:
return True
# Check if left and right subtrees are mirrors
return is_mirror(root.left, root.right)
# Test with symmetric tree
symmetric_root = TreeNode("/home/user/root.txt", 1000)
symmetric_root.left = TreeNode("/home/user/left.txt", 500)
symmetric_root.right = TreeNode("/home/user/right.txt", 500)
symmetric_root.left.left = TreeNode("/home/user/ll.txt", 250)
symmetric_root.left.right = TreeNode("/home/user/lr.txt", 250)
symmetric_root.right.left = TreeNode("/home/user/rl.txt", 250)
symmetric_root.right.right = TreeNode("/home/user/rr.txt", 250)
print("Symmetric tree structure:")
print(" root")
print(" / \\")
print(" left right")
print(" / \\ / \\")
print(" ll lr rl rr")
print(f"\nIs symmetric? {is_symmetric(symmetric_root)}")
# Test with asymmetric tree
asymmetric_root = TreeNode("/home/user/root.txt", 1000)
asymmetric_root.left = TreeNode("/home/user/left.txt", 500)
asymmetric_root.right = TreeNode("/home/user/right.txt", 500)
asymmetric_root.left.left = TreeNode("/home/user/ll.txt", 250)
# Missing symmetric counterpart
print(f"\nAsymmetric tree - Is symmetric? {is_symmetric(asymmetric_root)}")
Python Output:
Symmetric tree structure:
root
/ \
left right
/ \ / \
ll lr rl rr
Is symmetric? True
Asymmetric tree - Is symmetric? False
Note: This checks structural symmetry. For true mirror checking, we'd need to verify that left.filepath == right.filepath at corresponding positions.
Project: File System Tree Navigator
The Complete Journey: From Insertion to Full Tree Manipulation
Let's synthesize everything we've learned into a complete file system tree navigator that supports all operations.
The Journey: From Problem to Solution
| Iteration | Problem | Technique Applied | Result | Complexity Impact |
|---|---|---|---|---|
| 0 | Duplicate insertions | None | Broken BST invariant | N/A |
| 1 | Duplicates create siblings | Explicit equality check | Updates instead of adds | O(h) per insert |
| 2 | Deletion loses subtrees | Three-case deletion logic | Proper node removal | O(h) per delete |
| 3 | O(nΒ²) tree reconstruction | Index map optimization | O(n) reconstruction | O(n) total |
| 4 | Shallow copy problem | Recursive deep clone | Independent copies | O(n) space |
| 5 | Need tree transformations | Recursive mirror operations | Flexible tree views | O(n) time |
Final Implementation: Complete File System Tree
class FileSystemTree:
"""
A complete binary search tree for file system indexing.
Supports:
- Insertion with duplicate handling
- Deletion with three-case logic
- Tree reconstruction from traversals
- Cloning and mirroring
- Various traversals and queries
"""
def __init__(self):
self.root = None
# ===== INSERTION =====
def insert(self, filepath, size):
"""Insert a file, updating size if it already exists."""
self.root = self._insert_recursive(self.root, filepath, size)
def _insert_recursive(self, node, filepath, size):
"""Recursive insertion helper."""
if node is None:
return TreeNode(filepath, size)
if filepath < node.filepath:
node.left = self._insert_recursive(node.left, filepath, size)
elif filepath > node.filepath:
node.right = self._insert_recursive(node.right, filepath, size)
else:
node.size = size # Update existing
return node
# ===== DELETION =====
def delete(self, filepath):
"""Delete a file from the tree."""
self.root = self._delete_recursive(self.root, filepath)
def _delete_recursive(self, node, filepath):
"""Recursive deletion helper with three-case logic."""
if node is None:
return None
if filepath < node.filepath:
node.left = self._delete_recursive(node.left, filepath)
return node
elif filepath > node.filepath:
node.right = self._delete_recursive(node.right, filepath)
return node
# Found node to delete
if node.left is None and node.right is None:
return None
if node.left is None:
return node.right
if node.right is None:
return node.left
# Two children: find successor
successor = self._find_min(node.right)
node.filepath = successor.filepath
node.size = successor.size
node.right = self._delete_recursive(node.right, successor.filepath)
return node
def _find_min(self, node):
"""Find minimum node in subtree."""
while node.left is not None:
node = node.left
return node
# ===== SEARCH =====
def search(self, filepath):
"""Search for a file, return size or None."""
node = self._search_recursive(self.root, filepath)
return node.size if node else None
def _search_recursive(self, node, filepath):
"""Recursive search helper."""
if node is None or node.filepath == filepath:
return node
if filepath < node.filepath:
return self._search_recursive(node.left, filepath)
else:
return self._search_recursive(node.right, filepath)
# ===== TRAVERSALS =====
def inorder(self):
"""Return list of (filepath, size) in sorted order."""
result = []
self._inorder_recursive(self.root, result)
return result
def _inorder_recursive(self, node, result):
"""Recursive inorder helper."""
if node is None:
return
self._inorder_recursive(node.left, result)
result.append((node.filepath, node.size))
self._inorder_recursive(node.right, result)
def preorder(self):
"""Return list of (filepath, size) in preorder."""
result = []
self._preorder_recursive(self.root, result)
return result
def _preorder_recursive(self, node, result):
"""Recursive preorder helper."""
if node is None:
return
result.append((node.filepath, node.size))
self._preorder_recursive(node.left, result)
self._preorder_recursive(node.right, result)
def level_order(self):
"""Return list of levels, each level is list of (filepath, size)."""
if not self.root:
return []
from collections import deque
result = []
queue = deque([self.root])
while queue:
level = []
level_size = len(queue)
for _ in range(level_size):
node = queue.popleft()
level.append((node.filepath, node.size))
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
result.append(level)
return result
# ===== TREE CONSTRUCTION =====
@classmethod
def from_traversals(cls, preorder, inorder):
"""Build tree from preorder and inorder traversals."""
tree = cls()
inorder_map = {val: idx for idx, val in enumerate(inorder)}
def build(pre_start, pre_end, in_start, in_end):
if pre_start > pre_end:
return None
root_val = preorder[pre_start]
root = TreeNode(root_val, 0)
root_index = inorder_map[root_val]
left_size = root_index - in_start
root.left = build(
pre_start + 1,
pre_start + left_size,
in_start,
root_index - 1
)
root.right = build(
pre_start + left_size + 1,
pre_end,
root_index + 1,
in_end
)
return root
tree.root = build(0, len(preorder) - 1, 0, len(inorder) - 1)
return tree
# ===== TREE TRANSFORMATIONS =====
def clone(self):
"""Create a deep copy of the tree."""
new_tree = FileSystemTree()
new_tree.root = self._clone_recursive(self.root)
return new_tree
def _clone_recursive(self, node):
"""Recursive clone helper."""
if node is None:
return None
new_node = TreeNode(node.filepath, node.size)
new_node.left = self._clone_recursive(node.left)
new_node.right = self._clone_recursive(node.right)
return new_node
def mirror(self):
"""Mirror the tree in-place."""
self.root = self._mirror_recursive(self.root)
def _mirror_recursive(self, node):
"""Recursive mirror helper."""
if node is None:
return None
left = self._mirror_recursive(node.left)
right = self._mirror_recursive(node.right)
node.left = right
node.right = left
return node
def mirrored_copy(self):
"""Create a mirrored copy without modifying original."""
new_tree = FileSystemTree()
new_tree.root = self._mirror_copy_recursive(self.root)
return new_tree
def _mirror_copy_recursive(self, node):
"""Recursive mirror copy helper."""
if node is None:
return None
new_node = TreeNode(node.filepath, node.size)
new_node.left = self._mirror_copy_recursive(node.right)
new_node.right = self._mirror_copy_recursive(node.left)
return new_node
# ===== UTILITY METHODS =====
def height(self):
"""Calculate tree height."""
return self._height_recursive(self.root)
def _height_recursive(self, node):
"""Recursive height helper."""
if node is None:
return 0
return 1 + max(
self._height_recursive(node.left),
self._height_recursive(node.right)
)
def size(self):
"""Count total nodes."""
return self._size_recursive(self.root)
def _size_recursive(self, node):
"""Recursive size helper."""
if node is None:
return 0
return 1 + self._size_recursive(node.left) + self._size_recursive(node.right)
def total_file_size(self):
"""Calculate total size of all files."""
return self._total_size_recursive(self.root)
def _total_size_recursive(self, node):
"""Recursive total size helper."""
if node is None:
return 0
return (node.size +
self._total_size_recursive(node.left) +
self._total_size_recursive(node.right))
def __str__(self):
"""String representation showing tree structure."""
if not self.root:
return "Empty tree"
lines = []
self._build_tree_string(self.root, "", True, lines)
return "\n".join(lines)
def _build_tree_string(self, node, prefix, is_tail, lines):
"""Helper to build visual tree representation."""
if node is not None:
lines.append(prefix + ("βββ " if is_tail else "βββ ") +
f"{node.filepath} ({node.size}B)")
children = [node.left, node.right]
for i, child in enumerate(children):
if child is not None:
extension = " " if is_tail else "β "
self._build_tree_string(
child,
prefix + extension,
i == len([c for c in children if c is not None]) - 1,
lines
)
# ===== DEMONSTRATION =====
print("=" * 60)
print("FILE SYSTEM TREE NAVIGATOR - COMPLETE DEMONSTRATION")
print("=" * 60)
# Create tree and insert files
tree = FileSystemTree()
files = [
("/home/user/docs/report.txt", 1024),
("/home/user/docs/notes.txt", 512),
("/home/user/pics/photo.jpg", 2048),
("/home/user/code/main.py", 768),
("/home/user/docs/draft.txt", 256),
]
print("\n1. INSERTION")
print("-" * 60)
for filepath, size in files:
tree.insert(filepath, size)
print(f"Inserted: {filepath} ({size}B)")
print("\n2. TREE STRUCTURE")
print("-" * 60)
print(tree)
print("\n3. TRAVERSALS")
print("-" * 60)
print("In-order (sorted):")
for filepath, size in tree.inorder():
print(f" {filepath}: {size}B")
print("\nLevel-order (by depth):")
for level_num, level in enumerate(tree.level_order()):
print(f" Level {level_num}: {[f[0].split('/')[-1] for f in level]}")
print("\n4. SEARCH")
print("-" * 60)
search_path = "/home/user/code/main.py"
result = tree.search(search_path)
print(f"Search for '{search_path}': {result}B" if result else "Not found")
search_path = "/home/user/missing.txt"
result = tree.search(search_path)
print(f"Search for '{search_path}': {'Found' if result else 'Not found'}")
print("\n5. STATISTICS")
print("-" * 60)
print(f"Tree height: {tree.height()}")
print(f"Total files: {tree.size()}")
print(f"Total size: {tree.total_file_size()}B")
print("\n6. DELETION")
print("-" * 60)
delete_path = "/home/user/docs/notes.txt"
print(f"Deleting: {delete_path}")
tree.delete(delete_path)
print("\nTree after deletion:")
print(tree)
print("\n7. CLONING")
print("-" * 60)
cloned_tree = tree.clone()
print("Created clone. Modifying clone...")
cloned_tree.insert("/home/user/clone_only.txt", 999)
print(f"\nOriginal tree size: {tree.size()} files")
print(f"Cloned tree size: {cloned_tree.size()} files")
print("\n8. MIRRORING")
print("-" * 60)
mirrored = tree.mirrored_copy()
print("Original tree (level-order):")
for level in tree.level_order():
print(f" {[f[0].split('/')[-1] for f in level]}")
print("\nMirrored copy (level-order):")
for level in mirrored.level_order():
print(f" {[f[0].split('/')[-1] for f in level]}")
print("\n9. TREE RECONSTRUCTION")
print("-" * 60)
preorder_vals = [f[0] for f in tree.preorder()]
inorder_vals = [f[0] for f in tree.inorder()]
print(f"Preorder: {[p.split('/')[-1] for p in preorder_vals]}")
print(f"Inorder: {[i.split('/')[-1] for i in inorder_vals]}")
reconstructed = FileSystemTree.from_traversals(preorder_vals, inorder_vals)
print("\nReconstructed tree:")
print(reconstructed)
print("\n" + "=" * 60)
print("DEMONSTRATION COMPLETE")
print("=" * 60)
Python Output:
============================================================
FILE SYSTEM TREE NAVIGATOR - COMPLETE DEMONSTRATION
============================================================
1. INSERTION
------------------------------------------------------------
Inserted: /home/user/docs/report.txt (1024B)
Inserted: /home/user/docs/notes.txt (512B)
Inserted: /home/user/pics/photo.jpg (2048B)
Inserted: /home/user/code/main.py (768B)
Inserted: /home/user/docs/draft.txt (256B)
2. TREE STRUCTURE
------------------------------------------------------------
βββ /home/user/docs/report.txt (1024B)
βββ /home/user/docs/notes.txt (512B)
β βββ /home/user/code/main.py (768B)
β βββ /home/user/docs/draft.txt (256B)
βββ /home/user/pics/photo.jpg (2048B)
3. TRAVERSALS
------------------------------------------------------------
In-order (sorted):
/home/user/code/main.py: 768B
/home/user/docs/draft.txt: 256B
/home/user/docs/notes.txt: 512B
/home/user/docs/report.txt: 1024B
/home/user/pics/photo.jpg: 2048B
Level-order (by depth):
Level 0: ['report.txt']
Level 1: ['notes.txt', 'photo.jpg']
Level 2: ['main.py', 'draft.txt']
4. SEARCH
------------------------------------------------------------
Search for '/home/user/code/main.py': 768B
Search for '/home/user/missing.txt': Not found
5. STATISTICS
------------------------------------------------------------
Tree height: 3
Total files: 5
Total size: 4608B
6. DELETION
------------------------------------------------------------
Deleting: /home/user/docs/notes.txt
Tree after deletion:
βββ /home/user/docs/report.txt (1024B)
βββ /home/user/docs/draft.txt (256B)
β βββ /home/user/code/main.py (768B)
βββ /home/user/pics/photo.jpg (2048B)
7. CLONING
------------------------------------------------------------
Created clone. Modifying clone...
Original tree size: 4 files
Cloned tree size: 5 files
8. MIRRORING
------------------------------------------------------------
Original tree (level-order):
['report.txt']
['draft.txt', 'photo.jpg']
['main.py']
Mirrored copy (level-order):
['report.txt']
['photo.jpg', 'draft.txt']
['main.py']
9. TREE RECONSTRUCTION
------------------------------------------------------------
Preorder: ['report.txt', 'draft.txt', 'main.py', 'photo.jpg']
Inorder: ['main.py', 'draft.txt', 'report.txt', 'photo.jpg']
Reconstructed tree:
βββ /home/user/docs/report.txt (1024B)
βββ /home/user/docs/draft.txt (256B)
β βββ /home/user/code/main.py (768B)
βββ /home/user/pics/photo.jpg (2048B)
============================================================
DEMONSTRATION COMPLETE
============================================================
Decision Framework: Which Operation When?
| Operation | Use Case | Time Complexity | Space Complexity | Modifies Original? |
|---|---|---|---|---|
| Insert | Adding new files | O(h) | O(h) stack | Yes |
| Delete | Removing files | O(h) | O(h) stack | Yes |
| Search | Finding file size | O(h) | O(h) stack | No |
| Clone | Backup before modifications | O(n) | O(n + h) | No |
| Mirror | Reverse tree structure | O(n) | O(h) stack | Yes |
| Mirror Copy | Reversed view without changing original | O(n) | O(n + h) | No |
| Reconstruct | Deserialize from storage | O(n) | O(n) | N/A |
| Inorder | Get sorted file list | O(n) | O(n + h) | No |
| Level Order | Get files by depth | O(n) | O(n) | No |
| Height | Check tree balance | O(n) | O(h) stack | No |
| Total Size | Calculate disk usage | O(n) | O(h) stack | No |
Where h = tree height: - Best case (balanced): h = O(log n) - Worst case (skewed): h = O(n)
Lessons Learned
1. Recursive patterns are composable - Insertion, deletion, cloning, and mirroring all follow the same recursive structure - Base case + recursive case + combine results - Each operation builds on the same tree traversal foundation
2. In-place vs. copy operations have different trade-offs - In-place (mirror): O(h) space, modifies original - Copy (mirror_copy): O(n) space, preserves original - Choose based on whether you need the original tree
3. Auxiliary data structures improve efficiency - Index map for tree reconstruction: O(nΒ²) β O(n) - Hash maps turn linear searches into constant-time lookups - Small space investment for large time savings
4. Three-case deletion is the most complex operation - Leaf nodes: trivial - One child: simple - Two children: requires finding successor - Understanding this pattern is key to tree manipulation
5. Traversals provide different views of the same data - Inorder: sorted order (for BST) - Preorder: root-first (for serialization) - Level-order: breadth-first (for visualization) - Choose traversal based on what information you need
6. Tree reconstruction requires complementary information - Preorder + Inorder: unique tree β - Postorder + Inorder: unique tree β - Preorder + Postorder: not unique β - The combination determines what can be reconstructed
Common Failure Modes and Their Signatures
Symptom: Tree loses nodes after deletion
Python output pattern:
Expected 5 nodes, got 3 nodes after deletion
Diagnostic clues: - Deleted node had children - Children not properly reattached
Root cause: Missing case handling in deletion (only handled leaf case) Solution: Implement three-case deletion logic
Symptom: Modifications to "copy" affect original
Python output pattern:
Original tree modified when copy was changed
id(original) == id(copy) β True
Diagnostic clues: - Both variables point to same memory address - Changes to one affect the other
Root cause: Shallow copy (reference assignment) Solution: Implement recursive deep clone
Symptom: Tree reconstruction takes too long
Python output pattern:
Building tree from 1000 nodes takes 5+ seconds
Diagnostic clues: - O(nΒ²) behavior - Linear search in inorder array for each node
Root cause: No index map for root position lookup Solution: Build hash map of inorder indices
Symptom: Mirrored tree still shows original structure
Python output pattern:
Level-order traversal unchanged after mirror()
Diagnostic clues: - Mirror function returns new tree but doesn't assign it - Original tree reference not updated
Root cause: Forgot to assign result of mirror operation
Solution: tree.root = mirror_recursive(tree.root) or use in-place modification
Final Complexity Summary
Overall System Complexity:
| Metric | Value |
|---|---|
| Time (best case) | O(log n) for balanced tree operations |
| Time (worst case) | O(n) for skewed tree operations |
| Space (recursive) | O(h) for call stack, where h is height |
| Space (tree storage) | O(n) for storing n nodes |
| Construction from traversals | O(n) with index map optimization |
Operation-Specific Complexities:
| Operation | Time | Space | Modifies Tree? |
|---|---|---|---|
| Insert | O(h) | O(h) | Yes |
| Delete | O(h) | O(h) | Yes |
| Search | O(h) | O(h) | No |
| Inorder Traversal | O(n) | O(n) | No |
| Preorder Traversal | O(n) | O(n) | No |
| Level-Order | O(n) | O(n) | No |
| Clone | O(n) | O(n + h) | No (new tree) |
| Mirror (in-place) | O(n) | O(h) | Yes |
| Mirror (copy) | O(n) | O(n + h) | No (new tree) |
| Build from Traversals | O(n) | O(n) | N/A |
| Height Calculation | O(n) | O(h) | No |
| Size Count | O(n) | O(h) | No |
Where: - n = total number of nodes - h = height of tree - Best case: h = O(log n) for balanced trees - Worst case: h = O(n) for skewed trees
Performance Characteristics by Tree Shape
Balanced Tree (h β log n): - All operations run in O(log n) time - Ideal for production use - Achieved by self-balancing trees (AVL, Red-Black)
Skewed Tree (h = n): - Degrades to O(n) time for most operations - Essentially becomes a linked list - Occurs with sorted insertions without balancing
Average Case (random insertions): - Expected height: O(log n) - Good performance for most practical scenarios - No balancing overhead
Trade-offs Summary
Space vs. Time: - Index map in tree reconstruction: O(n) space saves O(nΒ²) β O(n) time - In-place mirror: O(h) space but modifies original - Copy operations: O(n) space preserves original
Flexibility vs. Efficiency: - Generic BST: Simple, but no height guarantees - Self-balancing trees: Guaranteed O(log n), but more complex - Array-based trees: O(1) child access, but fixed size
Recursive vs. Iterative: - Recursive: Elegant, natural for trees, O(h) stack space - Iterative: More complex, avoids stack overflow, O(1) auxiliary space (with explicit stack)
When Each Approach Excels
| Scenario | Best Approach | Rationale |
|---|---|---|
| Frequent insertions/deletions | Self-balancing BST | Maintains O(log n) guarantees |
| Static data, frequent searches | Sorted array with binary search | O(log n) search, O(1) space |
| Need sorted iteration | BST with inorder traversal | Natural sorted order |
| Hierarchical relationships | Generic tree (not BST) | Models parent-child structure |
| Fixed size, complete tree | Array-based heap | O(1) parent/child access |
| Serialize/deserialize | Traversal-based reconstruction | Compact representation |
| Backup before modifications | Clone operation | Preserves original state |
| Display mirrored view | Mirror copy | Non-destructive transformation |
Anti-Patterns to Avoid
-
Using BST for unsorted data: If you don't need sorted order, use a hash table (O(1) average case)
-
Skewed trees from sorted input: Always shuffle or use self-balancing trees for sorted data
-
Repeated searches without caching: If searching for same keys, use hash map (O(1) vs O(log n))
-
Deep recursion on large trees: Risk stack overflow; consider iterative approaches for production
-
Ignoring tree height: Unbalanced trees can degrade to O(n) operations
-
Over-engineering: Simple operations don't need complex self-balancing logic
Real-World Applications
File Systems: - Directory hierarchies (generic trees) - B-trees for disk-based indexing - Inodes and directory entries
Databases: - B+ trees for indexing - Query optimization trees - Transaction logs (append-only)
Compilers: - Abstract syntax trees (AST) - Expression evaluation trees - Parse trees for syntax analysis
Graphics: - Scene graphs for rendering - Quad-trees for spatial partitioning - Bounding volume hierarchies
Networks: - Routing tables (prefix trees) - DOM trees in web browsers - Decision trees in ML
Key Takeaways
-
Tree operations are inherently recursive: Understanding recursion is crucial for tree manipulation
-
Height determines performance: Balanced trees are exponentially faster than skewed ones
-
Space-time trade-offs are everywhere: Index maps, cloning, and traversal methods all involve trade-offs
-
Traversals provide different views: Choose based on what information you need (sorted, breadth-first, etc.)
-
Three-case deletion is the hardest: Master this pattern and other tree operations become easier
-
Reconstruction requires complementary info: Preorder + Inorder or Postorder + Inorder uniquely determine a tree
-
Preservation vs. modification: Choose between in-place (efficient) and copy operations (safe) based on needs
The journey from simple insertion to complex tree manipulation reveals fundamental patterns in computer science: recursion, divide-and-conquer, and the constant balance between time, space, and code complexity. These patterns extend far beyond binary search trees into all tree-based data structures and algorithms.